Data summary

Our data contain 2323 minute-level observations for 25 commuters and 71 commutes. We observed 1 commute for 10 commuters (40%), 2-4 commutes for 9 commuters (36%), and 5-6 commutes for 6 commuters (24%). The total commuting time observed per commuter ranged from 24 to 267 minutes, with a median of 54 minutes.

# packages used below: dplyr (verbs and %>%), knitr (kable)
library(dplyr)
library(knitr)
# total observations
nrow(rcomm)
## [1] 2323
# total commuters
unique(rcomm$ID) %>% length()
## [1] 25
# total commutes
select(rcomm, ID, date_local, group) %>% unique() %>% nrow()
## [1] 71
# person-days
select(rcomm, ID, date_local) %>% unique() %>% nrow()
## [1] 45
# days
select(rcomm, date_local) %>% unique() %>% nrow()
## [1] 38
# total minutes of obs per commuter
g1 <- group_by(rcomm, ID) %>% summarize(n=n()) %>% ungroup() %>% arrange(n)
g1 %>%
  kable()
ID n
GMU1005 24
GMU1044 24
GMU1042 25
GMU1043 31
GMU1046 33
GMU1027 34
GMU1022 36
GMU1047 37
GMU1014 47
GMU1050 50
GMU1028 51
GMU1038 51
GMU1012 54
GMU1007 84
GMU1016 94
GMU1035 95
GMU1032 98
GMU1045 119
GMU1040 129
GMU1036 139
GMU1041 167
GMU1001 199
GMU1026 209
GMU1018 226
GMU1037 267
# summary of total minutes of obs per commuter
summarize(g1, min(n), max(n), median(n), mean(n))
## # A tibble: 1 × 4
##   `min(n)` `max(n)` `median(n)` `mean(n)`
##      <int>    <int>       <int>     <dbl>
## 1       24      267          54      92.9
# commutes per commuter
select(rcomm, ID, date_local, group) %>% unique() %>% group_by(ID) %>% count() %>%
  ungroup() %>% rename(commutesobs= n) %>% count(commutesobs) %>% 
  mutate(perc = round(100 * n/sum(n), 1)) %>% kable()
commutesobs n perc
1 10 40
2 1 4
3 5 20
4 3 12
5 4 16
6 2 8
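The grouped percentages quoted in the summary paragraph (1, 2-4, and 5-6 commutes) can be recovered from the table above. The following is a minimal sketch in base R that re-enters the tabulated counts by hand; the variable names are illustrative and not part of the original analysis.

```r
# Number of commuters observed for each count of commutes (copied from the table above)
commutes_obs <- c(1, 2, 3, 4, 5, 6)
n_commuters  <- c(10, 1, 5, 3, 4, 2)

# Collapse into the bins used in the text: (0,1], (1,4], (4,6]
bins <- cut(commutes_obs, breaks = c(0, 1, 4, 6), labels = c("1", "2-4", "5-6"))
n_bin <- tapply(n_commuters, bins, sum)
perc_bin <- round(100 * n_bin / sum(n_bin), 1)

# Total commutes implied by the table
total_commutes <- sum(commutes_obs * n_commuters)

print(data.frame(bin = names(n_bin), n = as.vector(n_bin), perc = as.vector(perc_bin)))
print(total_commutes)
```

The total implied by the table should agree with the 71 commutes reported at the top of the section, which is a useful consistency check when the data are updated.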

The average observed commute lasted approximately 33 minutes (2323 observed minutes over 71 commutes), with individual commutes ranging from 1 to 99 minutes.

## # A tibble: 1 × 2
##     min   max
##   <int> <int>
## 1     1    99
ID mean min max median
GMU1001 39.80000 16 50 44.0
GMU1005 24.00000 24 24 24.0
GMU1007 21.00000 17 26 20.5
GMU1012 54.00000 54 54 54.0
GMU1014 47.00000 47 47 47.0
GMU1016 31.33333 15 45 34.0
GMU1018 37.66667 24 63 32.0
GMU1022 36.00000 36 36 36.0
GMU1026 69.66667 37 99 73.0
GMU1027 34.00000 34 34 34.0
GMU1028 17.00000 11 22 18.0
GMU1032 32.66667 21 39 38.0
GMU1035 23.75000 15 34 23.0
GMU1036 27.80000 17 42 27.0
GMU1037 44.50000 26 76 39.5
GMU1038 17.00000 1 31 19.0
GMU1040 25.80000 15 45 23.0
GMU1041 33.40000 16 58 28.0
GMU1042 25.00000 25 25 25.0
GMU1043 31.00000 31 31 31.0
GMU1044 24.00000 24 24 24.0
GMU1045 29.75000 23 40 28.0
GMU1046 16.50000 15 18 16.5
GMU1047 37.00000 37 37 37.0
GMU1050 50.00000 50 50 50.0
## # A tibble: 1 × 4
##   `mean(median)` `median(mean)` `mean(mean)` `median(median)`
##            <dbl>          <dbl>        <dbl>            <dbl>
## 1           33.0           31.3         33.2               31
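Commute lengths like those above can be derived by counting minute-level rows within each commute. The sketch below uses a small hypothetical data frame (the values are invented; only the column names `ID`, `date_local`, and `group` follow the code earlier in this section) and assumes each row represents one observed minute.

```r
# Hypothetical minute-level data: one row per observed minute
toy <- data.frame(
  ID         = c(rep("P1", 5), rep("P2", 3)),
  date_local = c(rep("2019-01-07", 5), rep("2019-01-08", 3)),
  group      = c(0, 0, 0, 1, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)
toy$minutes <- 1  # assumption: each row is one minute of observation

# One row per commute (ID x date x group), with its length in minutes
len <- aggregate(minutes ~ ID + date_local + group, data = toy, FUN = sum)
print(len)

# Overall mean commute length implied by the totals reported in this section
overall_mean <- 2323 / 71
print(round(overall_mean, 1))
```

Note that the mean over all commutes (total minutes divided by total commutes) need not equal the mean of per-commuter means, because commuters contribute different numbers of commutes.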

PM

Number of zeroes

## # A tibble: 2 × 2
##   ID          n
##   <chr>   <int>
## 1 GMU1001    29
## 2 GMU1050     3

Histogram of PM

Histogram of PM by ID

Histogram of log(PM + 0.01) by ID

Summarize

By ID

ID lmean lsd lmin lmax mean sd min max
GMU1001 -0.04 2.05 -4.61 3.09 2.46 2.84 0.00 21.87
GMU1005 1.56 0.17 1.37 1.88 4.80 0.85 3.91 6.52
GMU1007 1.17 0.84 -1.11 2.56 4.28 2.90 0.32 12.89
GMU1012 0.68 0.15 0.47 1.06 1.98 0.32 1.59 2.88
GMU1014 1.57 0.42 0.49 2.54 5.21 2.03 1.62 12.62
GMU1016 1.79 0.58 0.84 3.92 7.42 7.48 2.30 50.22
GMU1018 0.88 0.51 -1.04 2.32 2.75 1.64 0.34 10.12
GMU1022 2.25 0.18 1.82 2.60 9.66 1.71 6.14 13.46
GMU1026 0.75 0.51 -2.19 2.20 2.37 1.20 0.10 9.03
GMU1027 1.30 0.30 0.65 2.08 3.83 1.35 1.91 8.03
GMU1028 2.19 0.44 1.30 3.78 9.93 6.02 3.67 43.78
GMU1032 0.44 0.44 -2.08 1.62 1.67 0.63 0.11 5.05
GMU1035 1.35 1.27 -0.66 3.88 9.32 14.00 0.50 48.25
GMU1036 1.45 0.66 -0.01 2.78 5.22 3.41 0.98 16.05
GMU1037 1.41 0.73 0.25 3.17 5.61 5.49 1.28 23.73
GMU1038 1.61 0.73 1.04 3.89 7.44 10.08 2.81 48.72
GMU1040 0.70 0.27 0.30 2.06 2.09 0.78 1.34 7.87
GMU1041 1.71 0.61 1.05 2.90 6.63 4.05 2.84 18.07
GMU1042 1.79 0.29 1.49 2.36 6.23 1.88 4.41 10.54
GMU1043 0.60 0.27 -0.02 1.11 1.88 0.50 0.97 3.04
GMU1044 2.91 0.15 2.63 3.15 18.60 2.84 13.84 23.28
GMU1045 0.14 0.24 -0.12 1.03 1.17 0.33 0.88 2.79
GMU1046 2.35 0.22 2.00 2.85 10.76 2.44 7.36 17.25
GMU1047 1.58 0.35 1.10 2.25 5.16 1.89 3.00 9.45
GMU1050 0.74 1.46 -4.61 2.62 3.24 2.41 0.00 13.79
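The log-scale columns use the transform log(PM + 0.01) named in the histogram heading above. A short sketch of how that offset behaves, using hypothetical PM values: a reading of exactly zero maps to log(0.01), which is why the two participants with zero readings (GMU1001 and GMU1050) both show a log-scale minimum of -4.61.

```r
# Hypothetical minute-level PM2.5 readings, including a zero
pm <- c(0, 0.5, 2.5, 20)
offset <- 0.01  # constant added before taking logs, as in log(PM + 0.01)

lpm <- log(pm + offset)
print(round(lpm, 2))

# Back-transform to the original scale
pm_back <- exp(lpm) - offset
print(round(pm_back, 2))
```

Because the offset is small relative to typical readings, it has little effect on non-zero values but places zeros far in the left tail, which inflates the log-scale SD for those participants.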

Across IDs

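The average minute-level PM2.5 across participants was 5.6 µg/m³, with individual minute-level values ranging from 0 to 50.2 µg/m³. In general, variability between participants was greater than variability within a participant across commutes (SD of participant means 3.91 versus a mean within-participant SD of 3.16), though within-participant variability was large for some participants.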

name med mean sd
lmean 1.41 1.32 0.72
mean 5.16 5.59 3.91
sd 2.03 3.16 3.27
## # A tibble: 1 × 2
##     min   max
##   <dbl> <dbl>
## 1     0  50.2
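The between- versus within-participant comparison can be checked directly from the By ID table. The sketch below re-enters the rounded per-participant means and SDs from that table by hand (so results agree only up to rounding) and compares the spread of participant means with the typical within-participant SD.

```r
# Per-participant mean PM2.5, copied from the By ID table (rounded values)
id_mean <- c(2.46, 4.80, 4.28, 1.98, 5.21, 7.42, 2.75, 9.66, 2.37, 3.83,
             9.93, 1.67, 9.32, 5.22, 5.61, 7.44, 2.09, 6.63, 6.23, 1.88,
             18.60, 1.17, 10.76, 5.16, 3.24)

# Per-participant SD of PM2.5, copied from the same table
id_sd <- c(2.84, 0.85, 2.90, 0.32, 2.03, 7.48, 1.64, 1.71, 1.20, 1.35,
           6.02, 0.63, 14.00, 3.41, 5.49, 10.08, 0.78, 4.05, 1.88, 0.50,
           2.84, 0.33, 2.44, 1.89, 2.41)

between_sd <- sd(id_mean)    # spread of participant means
within_sd  <- mean(id_sd)    # average within-participant spread

print(round(c(mean_of_means = mean(id_mean), median_of_means = median(id_mean),
              between_sd = between_sd, within_sd = within_sd), 2))
```

This is only a descriptive comparison; a formal decomposition would need the minute-level data and, for example, a random-intercept model.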

Violin plots

Box plots

Commute summaries

Mean by commute

SD by commute

Roadiness/Road type

  1. Possible misclassification

Most observations (N=1219, 52.5%) were on local roads. 603 observations (26.0%) were on highways or secondary highways, and 403 (17.3%) were on local connecting roads, which are XX. The remainder (N=98, 4.2%) were on ramps, tunnels, or other road types.

## # A tibble: 4 × 3
##   rtype            n  perc
##   <fct>        <int> <dbl>
## 1 High/SecHigh   603  26  
## 2 LocalConn      403  17.3
## 3 Local         1219  52.5
## 4 Other           98   4.2
ID rtype perc
GMU1026 High/SecHigh 71.3
GMU1040 High/SecHigh 69.8
GMU1042 High/SecHigh 60.0
GMU1045 High/SecHigh 52.1
GMU1050 High/SecHigh 46.0
GMU1037 High/SecHigh 40.8
GMU1005 High/SecHigh 33.3
GMU1035 High/SecHigh 29.5
GMU1044 High/SecHigh 29.2
GMU1038 High/SecHigh 25.5
GMU1027 High/SecHigh 23.5
GMU1041 High/SecHigh 22.2
GMU1028 High/SecHigh 13.7
GMU1036 High/SecHigh 12.9
GMU1016 High/SecHigh 12.8
GMU1022 High/SecHigh 11.1
GMU1047 High/SecHigh 10.8
GMU1043 High/SecHigh 9.7
GMU1032 High/SecHigh 4.1
GMU1014 High/SecHigh 2.1
GMU1007 High/SecHigh 1.2
GMU1012 LocalConn 61.1
GMU1028 LocalConn 56.9
GMU1046 LocalConn 51.5
GMU1032 LocalConn 49.0
GMU1018 LocalConn 35.0
GMU1001 LocalConn 30.2
GMU1041 LocalConn 21.6
GMU1037 LocalConn 18.7
GMU1050 LocalConn 18.0
GMU1026 LocalConn 11.0
GMU1045 LocalConn 8.4
GMU1016 LocalConn 5.3
GMU1014 LocalConn 4.3
GMU1043 LocalConn 3.2
GMU1007 LocalConn 1.2
GMU1007 Local 97.6
GMU1014 Local 93.6
GMU1036 Local 87.1
GMU1047 Local 83.8
GMU1022 Local 80.6
GMU1016 Local 77.7
GMU1043 Local 77.4
GMU1038 Local 72.5
GMU1001 Local 69.3
GMU1027 Local 67.6
GMU1035 Local 65.3
GMU1018 Local 65.0
GMU1044 Local 58.3
GMU1041 Local 50.3
GMU1046 Local 48.5
GMU1032 Local 46.9
GMU1042 Local 40.0
GMU1012 Local 38.9
GMU1045 Local 33.6
GMU1037 Local 30.7
GMU1040 Local 30.2
GMU1050 Local 30.0
GMU1005 Local 29.2
GMU1028 Local 27.5
GMU1026 Local 9.6
GMU1005 Other 37.5
GMU1044 Other 12.5
GMU1037 Other 9.7
GMU1043 Other 9.7
GMU1027 Other 8.8
GMU1022 Other 8.3
GMU1026 Other 8.1
GMU1041 Other 6.0
GMU1050 Other 6.0
GMU1045 Other 5.9
GMU1047 Other 5.4
GMU1035 Other 5.3
GMU1016 Other 4.3
GMU1028 Other 2.0
GMU1038 Other 2.0
GMU1001 Other 0.5

The distribution changes little if we instead take the modal road type for each commute:

rtypeMode n perc
High/SecHigh 17 23.9
LocalConn 12 16.9
Local 41 57.7
Other 1 1.4
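Taking the modal road type per commute can be sketched as follows. This is an illustrative base R version on hypothetical data, not the code used to produce the table above; `stat_mode` is a helper introduced here, and its tie-breaking rule (first type in sorted order) is an assumption.

```r
# Hypothetical minute-level data: one row per observed minute
toy <- data.frame(
  commute = c("A", "A", "A", "B", "B", "B", "B"),
  rtype   = c("Local", "Local", "High/SecHigh",
              "LocalConn", "High/SecHigh", "High/SecHigh", "Local"),
  stringsAsFactors = FALSE
)

# Most frequent value; ties resolve to the first value in sorted order
stat_mode <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# One modal road type per commute
rtype_mode <- tapply(toy$rtype, toy$commute, stat_mode)
print(rtype_mode)
```

With commute-level modes, a commute that is 60% highway and 40% local counts entirely as highway, so minority road types such as Other tend to shrink relative to the minute-level tabulation.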

Roadiness

Reporting raw values is not very informative because roadiness is standardized (overall mean 0, SD 1).

mean sd min max
0 1 -4.59 2.54
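For reference, standardizing means subtracting the overall mean and dividing by the overall SD, which forces the pooled summary to mean 0 and SD 1 as in the row above. A minimal sketch on hypothetical raw values (the raw roadiness scale is not shown in this report):

```r
# Hypothetical raw roadiness values
raw <- c(12.5, 40.2, 3.1, 88.0, 27.4)

# z-score: center on the overall mean, scale by the overall SD
z <- (raw - mean(raw)) / sd(raw)

print(round(c(mean = mean(z), sd = sd(z), min = min(z), max = max(z)), 2))
```

After standardization only relative comparisons are meaningful, for example a participant mean of 1.14 indicating routes about one SD above the pooled average.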

ID mean sd min max
GMU1001 -0.72 0.73 -2.13 0.62
GMU1005 0.59 0.25 0.26 0.92
GMU1007 -0.35 0.42 -1.08 0.31
GMU1012 -0.78 0.56 -1.61 0.31
GMU1014 0.32 0.17 -0.18 0.43
GMU1016 0.55 0.20 0.17 0.92
GMU1018 -0.51 0.72 -1.96 0.31
GMU1022 -0.12 0.51 -1.00 0.62
GMU1026 -0.34 1.96 -4.59 2.54
GMU1027 0.06 0.37 -0.90 0.37
GMU1028 -0.77 0.57 -1.53 0.44
GMU1032 -0.67 0.55 -1.55 0.38
GMU1035 0.21 0.80 -1.70 2.35
GMU1036 0.23 0.48 -0.84 0.96
GMU1037 0.12 0.69 -2.01 1.08
GMU1038 -0.26 0.48 -1.01 0.46
GMU1040 0.78 0.54 0.10 1.80
GMU1041 0.78 0.55 -0.28 2.06
GMU1042 -1.27 0.89 -2.41 0.40
GMU1043 -0.22 0.60 -0.88 0.70
GMU1044 0.60 0.16 0.31 0.92
GMU1045 1.14 0.67 -0.02 2.27
GMU1046 -0.35 0.44 -1.34 0.31
GMU1047 0.27 0.54 -1.06 0.62
GMU1050 0.31 0.70 -1.21 1.80

Weather

L1 variables: created as a 1-day lag of the corresponding daily variable.
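One way to build a 1-day lag is to shift each record forward one calendar day and join it back by date. The sketch below uses hypothetical daily data and is not necessarily how the L1 variables in this report were constructed; joining on date rather than shifting rows keeps the lag correct when days are missing.

```r
# Hypothetical daily weather records; note the gap on 2019-01-04
wx <- data.frame(
  date = as.Date(c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-05")),
  tmax = c(4.4, 6.1, 2.2, 8.3)
)

# Each day's value becomes the lagged value for the following calendar day
lagged <- data.frame(date = wx$date + 1, tmaxL1 = wx$tmax)

# Left join so every original day is kept; unmatched days get NA
wx_l1 <- merge(wx, lagged, by = "date", all.x = TRUE)
print(wx_l1)
```

In this toy example the first day and the day after the gap both receive NA, whereas a simple row shift would wrongly assign the 2019-01-03 value to 2019-01-05.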

Variables

https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt

  • PRCP = Precipitation (tenths of mm)
  • SNOW = Snowfall (mm)
  • TMAX = Maximum temperature (degrees C, original tenths of degrees C)
  • TMIN = Minimum temperature (degrees C, original tenths of degrees C)
  • AWND = Average daily wind speed (m/s, original tenths of meters per second)
  • wdf2, wdf5 = Direction of fastest wind (fastest 2-minute wind vs. fastest 5-second wind) (degrees)
  • cat2, cat5 = Categorical direction of fastest wind (derived from wdf2 and wdf5, respectively)
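The unit conversions and the categorical wind direction listed above can be sketched as follows. The raw values are hypothetical, and the quadrant cut points used for the categorical direction are an assumption for illustration; the report does not state how cat2 and cat5 were defined.

```r
# Hypothetical raw GHCN-Daily values in their native units
raw <- data.frame(
  TMAX = c(215, -21),   # tenths of degrees C
  AWND = c(38, 91),     # tenths of m/s
  WDF2 = c(135, 320)    # degrees
)

# Convert tenths to whole units
conv <- data.frame(
  tmax = raw$TMAX / 10,  # degrees C
  awnd = raw$AWND / 10   # m/s
)

# Assumed quadrant bins: [0,90) NE, [90,180) SE, [180,270) SW, [270,360] NW
conv$cat2 <- cut(raw$WDF2 %% 360,
                 breaks = c(0, 90, 180, 270, 360),
                 labels = c("NE", "SE", "SW", "NW"),
                 right = FALSE, include.lowest = TRUE)
print(conv)
```

Quadrants other than the dominant ones could then be collapsed into a single Other level when few days fall in them.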

EDA

name mean sd min max
awnd 3.76 1.67 1.00 9.10
awndL1 3.41 1.83 1.00 9.10
awndL1m 3.69 1.55 1.45 8.70
prcp 29.85 54.79 0.00 204.23
prcpL1 45.31 108.38 0.00 645.98
prcpL1m 28.48 71.22 0.00 328.49
snow 2.94 10.76 0.00 61.09
snowL1 1.18 6.15 0.00 40.56
snowL1m 2.85 8.68 0.00 34.16
tmax 13.47 8.03 2.15 29.22
tmaxL1 12.83 7.60 2.60 28.77
tmaxL1m 12.96 8.43 -2.02 30.24
tmin 3.17 7.24 -12.10 21.05
tminL1 3.06 6.91 -6.72 21.05
tminL1m 2.54 7.93 -15.80 21.05

Snow and precipitation: binary

name value n perc
prcpbin 0 18 40.0
prcpbin 1 27 60.0
prcpbinL1 0 16 35.6
prcpbinL1 1 29 64.4
prcpbinL1m 0 11 24.4
prcpbinL1m 1 34 75.6
snowbin 0 40 88.9
snowbin 1 5 11.1
snowbinL1 0 42 93.3
snowbinL1 1 3 6.7
snowbinL1m 0 39 86.7
snowbinL1m 1 6 13.3
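A binary indicator of this kind can be built by thresholding the daily amount. The sketch below assumes the indicator is 1 for any amount above zero, which is a guess at the definition rather than something stated in this report, and uses hypothetical precipitation values.

```r
# Hypothetical daily precipitation (tenths of mm)
prcp <- c(0, 0, 12, 204, 0, 3)

# Assumed rule: 1 if any precipitation was recorded, 0 otherwise
prcpbin <- as.integer(prcp > 0)

# Counts and percentages by level, in the same layout as the table above
tab <- table(prcpbin)
perc <- round(100 * tab / sum(tab), 1)
print(data.frame(value = names(tab), n = as.vector(tab), perc = as.vector(perc)))
```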

By observation

name mean sd min max
awnd 3.50 1.51 1.00 9.10
awndL1 3.37 1.86 1.00 9.10
awndL1m 3.51 1.48 1.45 8.70
group 0.44 0.63 0.00 2.00
prcp 34.18 61.16 0.00 204.23
prcpL1 44.79 94.32 0.00 645.98
prcpL1m 26.29 63.70 0.00 328.49
snow 1.97 8.68 0.00 61.09
snowL1 0.82 4.94 0.00 40.56
snowL1m 2.60 8.02 0.00 34.16
tmax 13.02 7.72 2.15 29.22
tmaxL1 12.26 7.29 2.60 28.77
tmaxL1m 12.74 8.19 -2.02 30.24
tmin 2.75 6.74 -12.10 21.05
tminL1 2.68 6.69 -6.72 21.05
tminL1m 2.23 7.42 -15.80 21.05

Snow and precipitation: binary

name value n perc
prcpbin 0 30 42.3
prcpbin 1 41 57.7
prcpbinL1 0 26 36.6
prcpbinL1 1 45 63.4
prcpbinL1m 0 18 25.4
prcpbinL1m 1 53 74.6
snowbin 0 65 91.5
snowbin 1 6 8.5
snowbinL1 0 67 94.4
snowbinL1 1 4 5.6
snowbinL1m 0 61 85.9
snowbinL1m 1 10 14.1

Wind direction

  • Wind direction: wdf5 (fastest 5-second wind)
cat5sm n perc
SE 17 37.8
NW 23 51.1
Other 5 11.1

  • Wind direction: wdf2 (fastest 2-minute wind)
cat2sm n perc
SE 15 33.3
NW 22 48.9
Other 8 17.8

Compare commutes

Possible multiple obs of same commute: